Abstract: In contrast to the prevalent assumption of rich multipath
in information theoretic analysis of wireless channels, physical channels
exhibit sparse multipath, especially at large bandwidths. We propose
a model for sparse multipath fading channels and present results on
the impact of sparsity on non-coherent capacity and reliability in the
wideband regime. A key implication of sparsity is that the statistically
independent degrees of freedom in the channel, which represent the
delay-Doppler diversity afforded by multipath, scale at a sub-linear rate
with the signal space dimension (time-bandwidth product). Our analysis is
based on a training-based communication scheme that uses short-time
Fourier (STF) signaling waveforms. Sparsity in delay-Doppler manifests
itself as time-frequency coherence in the STF domain. From a capacity
perspective, sparse channels are asymptotically coherent: the gap between
coherent and non-coherent extremes vanishes in the limit of large signal
space dimension without the need for peaky signaling. From a reliability
viewpoint, there is a fundamental tradeoff between channel diversity and
learnability that can be optimized to maximize the error exponent at any
rate by appropriately choosing the signaling duration as a function of
bandwidth.



coefficients represent the DoF in delay and Doppler. Sparse channels
correspond to a sparse set of virtual coefficients, and a key implication
is the sub-linear scaling of the number of independent degrees of
freedom (DoF) with signal space dimensions. This is in contrast to
rich multipath, where the DoF scale linearly. We consider signaling
over orthogonal short-time Fourier (STF) basis functions that serve
as approximate eigenfunctions for underspread channels and provide
a natural mechanism to relate sparsity in delay-Doppler to coherence
in time-frequency.

With no receiver CSI a priori, we consider training-based
communication schemes and investigate the wideband ergodic capacity
under the assumption that the time-frequency coherence dimension
$N_c$ scales with SNR according to

$$N_c = \frac{k}{\mathrm{SNR}^{\mu}}, \qquad \mu > 0 \tag{2}$$
I. Introduction

Recent advances in the emerging areas of ultra-wideband
communication systems and wireless sensor networks have renewed the
search for a complete understanding of the fundamental performance
limits in the wideband/low SNR regime. The impact of multipath
signal propagation, which leads to fading, on the capacity and
reliability of wideband channels depends critically on knowledge
of the channel state information (CSI) at the receiver. The seminal
work in [1] best illustrates this: with perfect CSI at the receiver, 
peak-power limited QPSK achieves second-order optimality, whereas 
with no receiver CSI, peaky signals are necessary to even achieve 
first-order optimality, although they fail to be second-order optimal. 
Motivated by the fact that channel learning can bridge this gap and the 
sharp cut-off, the authors in [2] study wideband capacity by assuming 
a coherence time scaling with SNR of the form 
However, there is no explanation for why such scaling laws should
hold in practice. 

Accurate modeling of the channel characteristics in time and 
frequency, as a function of physical multipath characteristics, is 
critical in analyzing the performance of channel learning schemes 
and the impact of CSI on the performance limits. While most existing 
results assume rich multipath, there is growing experimental evidence 
(e.g. [3], [4]) that physical channels exhibit a sparse structure at 
wide bandwidths and when we code over long signaling durations. In 
this paper, we use a virtual representation [5] for physical multipath 
channels to present a framework for modeling sparsity. The virtual 
representation uniformly samples multipath in delay and Doppler at a 
resolution commensurate with the signaling bandwidth and signaling 
duration, respectively. Under this representation, the virtual channel 
It is observed that the coherence requirements for achieving capacity
are shared between both time and frequency: the coherence bandwidth,
$W_{coh}$, increases with bandwidth, $W$ (due to sparsity in delay),
and the coherence time, $T_{coh}$, increases with signaling duration $T$
(due to sparsity in Doppler). As a result, the scaling requirements on
$T_{coh}$ with $W$ needed in [2] for first- and second-order optimality
are replaced by scaling requirements on $N_c = T_{coh} W_{coh}$. This
leads to dramatically relaxed requirements on $T_{coh}$ scaling with
bandwidth/SNR compared to those assumed in [2]. In particular,
sparse multipath channels are asymptotically coherent; that is, for
a sufficiently large but fixed bandwidth, the conditions for first-
and second-order optimality can be achieved by simply making the
signaling duration sufficiently large according to

$$T = \left( k \, T_m^{\delta_2} W_d^{\delta_1} \right)^{\frac{1}{1-\delta_1}} \frac{W^{\frac{\mu - 1 + \delta_2}{1-\delta_1}}}{P^{\frac{\mu}{1-\delta_1}}} \tag{3}$$
Equation (3) relates the signaling parameters $(T, W, P)$, as a function
of the channel parameters $(T_m, W_d, \delta_1, \delta_2)$, in order for the
relationship (2) to hold between $N_c$ and SNR at any desired value of
$\mu$ (in particular, $\mu > 1$ for first-order optimality and $\mu > 3$ for
second-order optimality). The asymptotic coherence of sparse channels also
eliminates the need for peaky signaling that has been emphasized
in existing results [6], [1] for increasing the spectral efficiency of
non-coherent communication schemes.

Our investigation of the reliability of sparse channels is through 
random coding error exponents [7]. For training-based communication
schemes, our results reveal a fundamental learnability versus
diversity tradeoff in sparse channels. At any transmission rate less 
than the coherent capacity, there is an optimal choice of signal 
parameters (as a function of channel parameters) that optimizes the 
tradeoff and yields the largest error exponent. 
II. System Setup
A. Sparse Multipath Channel Modeling 

A physical discrete multipath channel can be modeled as 

$$h(\tau, \nu) = \sum_{n} \beta_n \, \delta(\tau - \tau_n) \, \delta(\nu - \nu_n) \tag{4}$$



where $h(\tau, \nu)$ is the delay-Doppler spreading function of the channel,
and $\beta_n$, $\tau_n \in [0, T_m]$ and $\nu_n \in [-W_d/2, W_d/2]$ denote the complex
path gain, delay and Doppler shift associated with the $n$-th path.
$T_m$ and $W_d$ are the delay and Doppler spreads, respectively, and
$w(t)$ is additive white Gaussian noise (AWGN). We assume a
sufficiently underspread channel, $T_m W_d \ll 1$. In this paper we
use a virtual representation [5], [8] for time- and frequency-selective
multipath channels that captures the channel characteristics in terms
of resolvable paths and greatly facilitates system analysis from
a communication-theoretic perspective. The virtual representation
uniformly samples the multipath in delay and Doppler at a resolution
commensurate with signaling bandwidth $W$ and signaling duration
$T$, respectively [5], [8]:

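$$y(t) = \sum_{\ell} \sum_{m} h_{\ell,m} \, x(t - \ell/W) \, e^{j 2\pi m t / T} \tag{5}$$

$$h_{\ell,m} \approx \sum_{n:\, (\tau_n, \nu_n) \,\in\, \text{bin}(\ell, m)} \beta_n \tag{6}$$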

The sampled representation (5) is linear and is characterized by the
virtual delay-Doppler channel coefficients $\{h_{\ell,m}\}$. Each $h_{\ell,m}$
consists of the sum of gains of all paths whose delays and Doppler shifts
lie within the $(\ell, m)$-th delay-Doppler resolution bin, as shown in
Fig. 1(a). Distinct $h_{\ell,m}$'s correspond to approximately disjoint subsets
of paths and are hence approximately statistically independent (due
to independent path gains and phases). In this work, we assume that
the channel coefficients $\{h_{\ell,m}\}$ are perfectly independent. We also
assume Rayleigh fading, in which $\{h_{\ell,m}\}$ are zero-mean Gaussian
random variables, and the channel statistics are thus characterized by
the power in the virtual channel coefficients

$$\Psi(\ell, m) = E\left[ |h_{\ell,m}|^2 \right] \tag{7}$$

which is a measure of the (sampled) delay-Doppler power spectrum. 

Let $D$ denote the number of dominant¹ non-zero channel
coefficients. The parameter $D$ reflects the statistically independent degrees
of freedom (DoF) in the channel and also signifies the delay-Doppler
diversity afforded by the channel. It can be bounded as



$$D = D_T D_W \le D_{\max} = D_{T,\max} D_{W,\max}, \qquad D_{T,\max} = \lceil T W_d \rceil, \quad D_{W,\max} = \lceil T_m W \rceil \tag{8}$$



where $D_{T,\max}$ denotes the maximum number of resolvable paths in
Doppler (maximum Doppler or time diversity) and $D_{W,\max}$ denotes
the maximum number of resolvable paths in delay (maximum delay
or frequency diversity). In rich multipath, $D_T = D_{T,\max}$ and
$D_W = D_{W,\max}$, and each delay-Doppler resolution bin in Fig. 1(a)
is populated by a path. In this case $D$ scales linearly with the signal
space dimensions, $N = TW$.

However, recent measurement campaigns [3], [4] for UWB channels
show that dominant channel coefficients get sparser in the delay
domain as the bandwidth increases. As we consider large bandwidths 
and/or long signaling durations, the resolution of paths in both delay 

¹ For which $\Psi(\ell, m) > \gamma$ for some prescribed threshold $\gamma > 0$.
Fig. 1. (a) Delay-Doppler sampling commensurate with signaling duration
and bandwidth. (b) Time-frequency coherence subspaces in STF signaling. (c)
Illustration of the training-based communication scheme in the STF domain.
One dimension in each coherence subspace (dark squares) represents the training
dimension and the remaining dimensions are used for communication.



and Doppler domains gets finer, leading to the scenario in Fig. 1(a)
where the delay-Doppler resolution bins are sparsely populated with
paths, i.e., $D < D_{\max}$. Thus physical multipath channels get sparser
with increasing $W$ due to fewer than $D_{W,\max}$ resolvable delays and
with increasing $T$ due to fewer than $D_{T,\max}$ resolvable Doppler
shifts. We model such sparse behavior with a sub-linear scaling in
$D_T$ and $D_W$ with $T$ and $W$:



$$D_T \sim (T W_d)^{\delta_1}, \qquad D_W \sim (T_m W)^{\delta_2}, \qquad \delta_1, \delta_2 \in [0, 1] \tag{9}$$



where $\{\delta_i\}$ represent channel sparsity; the smaller the value of
$\delta_i$, the slower (sparser) the growth in the resolvable paths in the
corresponding domain. This implies that the delay-Doppler DoF,
$D = D_T D_W$, scale sub-linearly with the number of signal space
dimensions $N$. Note that with perfect CSI at the receiver, $D$ reflects
the delay-Doppler diversity afforded by the channel, whereas with no CSI, it
reflects channel uncertainty.
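The contrast between linear and sub-linear DoF scaling can be checked numerically. The following is a minimal sketch, assuming the scaling relations in (9) hold with equality and unit proportionality constants; the function name `degrees_of_freedom` and the numerical channel values are hypothetical choices for illustration and are not taken from the paper.

```python
def degrees_of_freedom(T, W, T_m, W_d, delta1, delta2):
    """Return (D_T, D_W, D), assuming D_T = (T*W_d)**delta1 and
    D_W = (T_m*W)**delta2 with proportionality constants set to 1."""
    D_T = (T * W_d) ** delta1   # resolvable Doppler shifts
    D_W = (T_m * W) ** delta2   # resolvable delays
    return D_T, D_W, D_T * D_W


# Hypothetical channel: 1 microsecond delay spread, 100 Hz Doppler spread.
T_m, W_d, T = 1e-6, 100.0, 1.0
for W in (1e7, 1e8, 1e9):
    N = T * W  # signal space dimension
    D_rich = degrees_of_freedom(T, W, T_m, W_d, 1.0, 1.0)[2]
    D_sparse = degrees_of_freedom(T, W, T_m, W_d, 0.5, 0.5)[2]
    # D/N is constant for rich multipath and shrinks for sparse multipath.
    print(f"W={W:.0e}  D_rich/N={D_rich / N:.2e}  D_sparse/N={D_sparse / N:.2e}")
```

Under these assumptions the ratio D/N stays fixed at $T_m W_d$ in the rich case, while it decays as the bandwidth grows in the sparse case.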

B. Orthogonal Short-Time Fourier Signaling 

We consider signaling using an orthonormal short-time Fourier 
(STF) basis [9], [10] that is a natural generalization of orthogonal 
frequency-division multiplexing (OFDM) for time-varying channels.
An orthogonal STF basis for the signal space is generated from a 
fixed prototype waveform g(t) via time and frequency shifts: 



$$\phi_{\ell,m}(t) = g(t - \ell T_o) \, e^{j 2\pi m W_o t}, \qquad \ell = 0, \ldots, N_T - 1, \quad m = 0, \ldots, N_W - 1 \tag{10}$$

where $T_o W_o = 1$, $N_T = T/T_o$, $N_W = W/W_o$ and $N = N_T N_W$.



The $N$ transmitted symbols $x_{\ell,m}$ are modulated onto the STF basis
waveforms. For a signaling duration $T$ and bandwidth $W$, the basis functions
span the signal space with dimension equal to $N = TW$.
The received signal is given by 

$$r(t) = \mathcal{H}(x(t)) + w(t)$$

The received signal is projected onto the STF basis waveforms to
yield the received symbols

$$r_{\ell,m} = \langle r, \phi_{\ell,m} \rangle = \sum_{\ell',m'} h_{\ell,m;\ell',m'} \, x_{\ell',m'} + w_{\ell,m} \tag{11}$$

Equivalently, we can represent the system in the STF domain using an
$N$-dimensional matrix system equation

$$r = \sqrt{\mathrm{SNR}} \, H x + w \tag{12}$$

where $w$ represents the additive noise vector whose entries are i.i.d.
$\mathcal{CN}(0, 1)$. The $N \times N$ matrix $H$ consists of the channel coefficients
$\{h_{\ell,m;\ell',m'}\}$ in (11). The parameter SNR represents the transmit
energy per modulated symbol and, for a given transmit power $P$,
equals $\mathrm{SNR} = TP/N = P/W$ (with $E[|x|^2] = 1$). In this work, our focus is on
the wideband regime, where $\mathrm{SNR} \to 0$.

For sufficiently underspread channels, the parameters $T_o$ and $W_o$
can be matched to $T_m$ and $W_d$ so that the STF basis waveforms
serve as approximate eigenfunctions of the channel [10], [9]. Thus
the $N \times N$ channel matrix $H$ is approximately diagonal. In this work,
we will assume that $H$ is exactly diagonal, that is,

$$H = \mathrm{diag}\left[ h_{1,1} \cdots h_{1,N_c}, \; h_{2,1} \cdots h_{2,N_c}, \; \cdots, \; h_{D,1} \cdots h_{D,N_c} \right] \tag{13}$$

Furthermore, the diagonal entries of $H$ in (13) admit an intuitive
block fading interpretation in terms of time-frequency coherence
subspaces [9] illustrated in Fig. 1(b). The signal space is partitioned
as

$$N = TW = N_c D \tag{14}$$

where $D$ represents the number of statistically independent
time-frequency coherence subspaces (delay-Doppler diversity), reflecting
the DoF in the channel (see (8)), and $N_c$ represents the dimension
of each coherence subspace, which we will refer to as the
time-frequency coherence dimension. In the block fading model in
(13), the channel coefficients over the $i$-th coherence subspace,
$h_{i,1} \cdots h_{i,N_c}$, are assumed to be identical, $h_i$, whereas the coefficients
across different coherence subspaces are independent and, due to
the stationarity of the channel statistics across time and frequency,
identically distributed. Thus, the $D$ distinct STF channel coefficients,
$\{h_i\}$, are i.i.d. zero-mean Gaussian random variables (Rayleigh
fading) with variance $E[|h_i|^2] = 1$.
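The block fading structure of the diagonal channel matrix can be sketched in a few lines. This is an illustrative sketch only: the helper name `block_fading_diagonal` is hypothetical, and it draws each coherence-subspace coefficient as a circularly symmetric complex Gaussian with unit variance, which is the Rayleigh fading assumption stated above.

```python
import random


def block_fading_diagonal(D, N_c, rng):
    """Return the N = D * N_c diagonal entries of H under block fading:
    D i.i.d. CN(0, 1) coefficients, each held constant over N_c dimensions."""
    sigma = 0.5 ** 0.5  # real and imaginary parts each have variance 1/2
    diagonal = []
    for _ in range(D):
        h_i = complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
        diagonal.extend([h_i] * N_c)  # identical within one coherence subspace
    return diagonal


rng = random.Random(0)
diag = block_fading_diagonal(D=3, N_c=4, rng=rng)
print(len(diag))           # N = D * N_c entries on the diagonal
print(diag[0] == diag[3])  # entries within a coherence subspace coincide
```

Only D of the N diagonal entries are statistically independent, which is why D, rather than N, measures the delay-Doppler diversity.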

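Using the DoF scaling for sparse channels in (9), the coherence
dimension of each coherence subspace can be computed as

$$T_{coh} = \frac{T}{D_T} = \frac{T^{1-\delta_1}}{W_d^{\delta_1}}, \qquad W_{coh} = \frac{W}{D_W} = \frac{W^{1-\delta_2}}{T_m^{\delta_2}} \tag{15}$$

$$N_c = T_{coh} W_{coh} = \frac{T^{1-\delta_1} W^{1-\delta_2}}{W_d^{\delta_1} T_m^{\delta_2}} \tag{16}$$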



where $T_{coh}$ is the coherence time and $W_{coh}$ is the coherence
bandwidth of the channel, as illustrated in Fig. 1(b). Note that
$\delta_1 = \delta_2 = 1$ corresponds to a rich multipath channel in which
$N_c = N_{c,\min} = 1/(T_m W_d)$ is constant and $D = D_{\max}$ increases
linearly with $N = TW$. This is the assumption prevalent in existing
works. In contrast, for sparse channels, $\delta_1, \delta_2 \in (0, 1)$, and both $N_c$
and $D$ increase sub-linearly with $N$. In terms of channel parameters,
$N_c$ increases with decreasing $T_m W_d$ as well as with smaller $\delta_i$. In
terms of signaling parameters, $N_c$ can be increased by increasing $T$
and/or $W$. On the other hand, when the channel is rich, $N_c$ depends
only on $T_m W_d$ and does not scale with $T$ or $W$.
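The dependence of the coherence dimension on the signaling parameters can be illustrated with a short computation. This is a sketch under the assumption that $T_{coh} = T/D_T$ and $W_{coh} = W/D_W$ with the scaling in (9) taken as an equality; the function name `coherence_dimension` and the numerical values are hypothetical.

```python
def coherence_dimension(T, W, T_m, W_d, delta1, delta2):
    """Return (T_coh, W_coh, N_c) for the assumed sparse scaling model."""
    T_coh = T ** (1.0 - delta1) / W_d ** delta1  # T / D_T
    W_coh = W ** (1.0 - delta2) / T_m ** delta2  # W / D_W
    return T_coh, W_coh, T_coh * W_coh


# Hypothetical channel: 1 microsecond delay spread, 100 Hz Doppler spread.
T_m, W_d = 1e-6, 100.0
for T, W in ((1.0, 1e8), (10.0, 1e8), (1.0, 1e9)):
    rich = coherence_dimension(T, W, T_m, W_d, 1.0, 1.0)[2]
    sparse = coherence_dimension(T, W, T_m, W_d, 0.5, 0.5)[2]
    # Rich: N_c stays at 1/(T_m*W_d). Sparse: N_c grows with T and with W.
    print(f"T={T:g} s, W={W:.0e} Hz: N_c rich={rich:.3e}, sparse={sparse:.3e}")
```

In the rich case the result is pinned to $1/(T_m W_d)$ regardless of T and W, whereas in the sparse case both a longer signaling duration and a larger bandwidth increase the coherence dimension.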

In this paper, our focus is on computing the sparse channel capacity
and reliability and, as we will see later, both metrics turn out to
be functions only of the parameters $N_c$ and SNR. Furthermore, in
the wideband limit they critically depend on the following relation
between $N_c$ and SNR:

$$N_c = \frac{k}{\mathrm{SNR}^{\mu}}, \qquad \mu > 0 \tag{17}$$

where $k > 0$ is a constant.



C. Training-Based Communication Using STF Signaling 

We use the block fading model induced by STF signaling to 
study the impact of time-frequency coherence on channel capacity 
and reliability in sparse multipath channels. Within the non-coherent 
regime, we focus our attention on a communication scheme in which 
the transmitted signals include training symbols to enable coherent 
detection. Although it is argued in [2] that training-based schemes are 
sub-optimal from a capacity point of view, the restriction to training 
schemes is motivated by practical considerations. 

We provide an outline of the training-based communication
scheme, adapted from [2], suitable to STF signaling (see [11] for
details). The total energy available for training and communication
is $PT$, of which a fraction $\eta$ is used for training and the remaining
fraction $(1 - \eta)$ is used for communication. Since the quality of the
channel estimate over one coherence subspace depends only on the
training energy and not on the number of training symbols [12],
our scheme uses one signal space dimension in each coherence
subspace for training and the remaining $(N_c - 1)$ for communication,
as illustrated in Fig. 1(c). We consider minimum mean squared error
(MMSE) estimation, under which the channel estimation performance
is measured in terms of the resulting mean squared error (MSE).
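To make the role of the training energy concrete, the following sketch evaluates the MMSE estimation error for one coherence subspace. It rests on assumptions that are not spelled out in the text above: the training energy per coherence subspace is taken to be $\eta N_c \mathrm{SNR}$ (the total training energy $\eta PT$ split evenly over the $D$ subspaces, in units of the noise variance), and the standard scalar result for a unit-variance Gaussian coefficient observed in unit-variance Gaussian noise is used, giving an MSE of $1/(1 + \eta N_c \mathrm{SNR})$. The function names are hypothetical.

```python
import random


def training_mse(eta, N_c, snr):
    """Closed-form MMSE error under the assumed per-subspace training energy."""
    E_tr = eta * N_c * snr
    return 1.0 / (1.0 + E_tr)


def simulate_training_mse(eta, N_c, snr, trials, rng):
    """Monte Carlo check: estimate h from y = sqrt(E_tr) * h + w."""
    E_tr = eta * N_c * snr
    sigma = 0.5 ** 0.5  # per-component standard deviation of CN(0, 1)
    total = 0.0
    for _ in range(trials):
        h = complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
        w = complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
        y = E_tr ** 0.5 * h + w
        h_hat = E_tr ** 0.5 / (1.0 + E_tr) * y  # linear MMSE estimate
        total += abs(h - h_hat) ** 2
    return total / trials


rng = random.Random(0)
print(training_mse(0.5, 100, 0.02))
print(simulate_training_mse(0.5, 100, 0.02, 20000, rng))
```

The error depends on $\eta$, $N_c$ and SNR only through their product, so at low SNR a larger coherence dimension is what keeps the channel estimate usable.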

III. Ergodic Capacity of the Training-Based 
Communication Scheme 

We first characterize the coherent capacity of the wideband channel
with perfect CSI at the receiver. The coherent capacity per dimension
(in bps/Hz) is defined as

$$C_{coh}(\mathrm{SNR}) = \sup_{Q:\, \mathrm{Tr}(Q) \le TP} \frac{E\left[ \log_2 \det\left( I_N + H Q H^H \right) \right]}{N}$$

where $P$ denotes transmit power and $H$ is the diagonal channel
matrix in (13) with the diagonal elements following the block-fading
structure. Due to the diagonal nature of $H$, the optimal $Q$ is also
diagonal. In particular, uniform power allocation $Q = \frac{TP}{N} I_N =
\mathrm{SNR} \, I_N$ achieves capacity and we have



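$$C_{coh}(\mathrm{SNR}) = \frac{\sum_{i=1}^{D} E\left[ \log_2\left( 1 + \frac{TP}{N} |h_i|^2 \right) \right]}{D} \overset{(a)}{=} E\left[ \log_2\left( 1 + \mathrm{SNR} \, |h|^2 \right) \right] \tag{18}$$

where (a) follows since $\{h_i\}$ are i.i.d., with $h$ representing a generic
random variable, $N = TW$ and $\mathrm{SNR} = TP/N = P/W$.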

The next proposition provides a lower bound to the coherent 
capacity in the low SNR regime [11]. 

Proposition 1: The coherent capacity, $C_{coh}(\mathrm{SNR})$, satisfies

$$C_{coh}(\mathrm{SNR}) \ge \log_2(e) \left( \mathrm{SNR} - \mathrm{SNR}^2 \right) \tag{19}$$

Moreover, the capacity converges to the lower bound in the limit of
$\mathrm{SNR} \to 0$.

The lower bound in Proposition 1 shows that the minimum energy
per bit necessary for reliable communication is given by
$(E_b/N_0)_{\min} = \log_e(2)$ and the wideband slope $S_0 = 1$, the two
fundamental metrics
of spectral efficiency in the wideband regime defined in [1].
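The low-SNR lower bound in Proposition 1 can be checked numerically. The sketch below assumes Rayleigh fading with unit variance, so that $|h|^2$ is exponentially distributed with mean 1, and evaluates $E[\log_2(1 + \mathrm{SNR}|h|^2)]$ by Simpson's rule on a truncated interval. The integration method, truncation point and function names are choices made for this illustration only.

```python
import math


def coherent_capacity(snr, upper=50.0, steps=20000):
    """E[log2(1 + snr * X)] for X ~ Exp(1), via composite Simpson's rule.
    The integrand decays like exp(-x), so truncating at `upper` is harmless."""
    def f(x):
        return math.log2(1.0 + snr * x) * math.exp(-x)

    dx = upper / steps  # steps must be even for Simpson's rule
    total = f(0.0) + f(upper)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * f(i * dx)
    return total * dx / 3.0


def low_snr_bound(snr):
    """Second-order lower bound log2(e) * (snr - snr**2)."""
    return math.log2(math.e) * (snr - snr ** 2)


for snr in (0.01, 0.05, 0.1):
    c = coherent_capacity(snr)
    print(f"SNR={snr}: capacity={c:.6f}, bound={low_snr_bound(snr):.6f}")
```

The gap between the two columns shrinks faster than $\mathrm{SNR}^2$ as SNR decreases, which is the sense in which the bound captures both the first- and second-order behavior.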

In terms of the scaling law, $N_c = k/\mathrm{SNR}^{\mu}$, $\mu > 0$, as defined
in (17), we are interested in computing the value of $\mu$ such that
the training-based communication scheme achieves first- and
second-order optimality. The result is summarized in the following theorem.

Theorem 1: The average mutual information of the training-based
scheme, with the scaling law $N_c = k/\mathrm{SNR}^{\mu}$, satisfies



Itr > log. 



SNR-O SNR 



(20) 



In particular, the first- and second-order optimality conditions are met
if and only if $\mu > 1$ and $\mu > 3$, respectively.

Proof: Omitted for this version. See [11] for details. ■ 
It can be seen that the coherence dimension $N_c$ plays a critical
role in determining the capacity of the training and communication
scheme. Both channel and signaling parameters impact $N_c$ in sparse
channels and, for a given $T_m W_d$ and $\{\delta_i\}$, the signal space parameters
$T$ and $W$ can be suitably chosen to obtain any desired value for $N_c$.
Recalling the expression for $W_{coh}$ in (15), we note that



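$$W_{coh} = \frac{W^{1-\delta_2}}{T_m^{\delta_2}} = \frac{P^{1-\delta_2}}{T_m^{\delta_2} \, \mathrm{SNR}^{1-\delta_2}} \tag{21}$$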



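and thus $W_{coh}$ naturally scales with SNR. Using (21), the expression
for $N_c$ in (16) becomes

$$N_c = \frac{T^{1-\delta_1} P^{1-\delta_2}}{W_d^{\delta_1} T_m^{\delta_2} \, \mathrm{SNR}^{1-\delta_2}} \tag{22}$$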



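Equating (17) with (22) leads to the following canonical relationship

$$T = \left( k \, T_m^{\delta_2} W_d^{\delta_1} \right)^{\frac{1}{1-\delta_1}} \frac{W^{\frac{\mu - 1 + \delta_2}{1-\delta_1}}}{P^{\frac{\mu}{1-\delta_1}}} \tag{23}$$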



that relates the signaling parameters $(T, W, P)$, as a function of the
channel parameters $(T_m, W_d, \delta_1, \delta_2)$, in order for the relationship (17)
to hold between $N_c$ and $\mathrm{SNR} = P/W$. Equations (17) and (23) are
the two key equations that capture the essence of the results in this
paper.
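As an illustration of how the canonical relationship could be used, the sketch below solves for the signaling duration $T$ that yields $N_c = k/\mathrm{SNR}^{\mu}$ at a given bandwidth and power, and then verifies the result by recomputing $N_c$. It assumes the sparse scaling model with unit proportionality constants and $\mathrm{SNR} = P/W$; the function names, the default $k = 1$, and the numerical values (including $P$ expressed relative to the noise level) are hypothetical.

```python
def coherence_dimension(T, W, T_m, W_d, delta1, delta2):
    """N_c = T_coh * W_coh under the assumed sparse scaling model."""
    return (T ** (1.0 - delta1) * W ** (1.0 - delta2)
            / (W_d ** delta1 * T_m ** delta2))


def signaling_duration(mu, W, P, T_m, W_d, delta1, delta2, k=1.0):
    """Solve N_c(T) = k / SNR**mu for T, with SNR = P / W."""
    if not 0.0 <= delta1 < 1.0:
        # For delta1 = 1 (rich in Doppler) N_c does not depend on T at all.
        raise ValueError("delta1 must lie in [0, 1)")
    rhs = k * T_m ** delta2 * W_d ** delta1 * W ** (mu - 1.0 + delta2) / P ** mu
    return rhs ** (1.0 / (1.0 - delta1))


# Hypothetical operating point: W = 1 GHz, P/N_0 = 40 dB, so SNR = 1e-5.
W, P, T_m, W_d = 1e9, 1e4, 1e-6, 100.0
for mu in (1.0, 1.5):
    T = signaling_duration(mu, W, P, T_m, W_d, 0.5, 0.5)
    N_c = coherence_dimension(T, W, T_m, W_d, 0.5, 0.5)
    print(f"mu={mu}: T={T:.3e} s, N_c={N_c:.3e}, target={(W / P) ** mu:.3e}")
```

The recomputed coherence dimension matches the target $k/\mathrm{SNR}^{\mu}$, and the required duration grows quickly with $\mu$ at fixed bandwidth and power.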



A. Discussion of Results on Ergodic Capacity 

In the context of existing results in [2] that assume rich multipath
($\delta_1 = \delta_2 = 1$), Theorem 1 shows that the requirement on $T_{coh}$ is now
the requirement on the coherence dimension $N_c = T_{coh} W_{coh}$. Thus,
the coherence cost is shared in both time and frequency, resulting
in significantly weakened scaling requirements for $T_{coh}$. If we have
$W_{coh} = O(W^{1-\delta_2})$, then the $T_{coh}$ scaling requirement reduces to

$$T_{coh} = N_c / W_{coh} = O\left( W^{2+\delta_2} \right) \tag{24}$$

to achieve second-order optimality. This is significantly less stringent
than the $T_{coh} = O(W^3)$ required in the framework of [2].

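Combining Theorem 1 with (23) leads to scaling rules for the
locus of points $(T, W, P)$ in order to achieve a desired value of $\mu$
(recall $\mu > 1$ for first-order optimality and $\mu > 3$ for second-order
optimality). Specifically,

$$\log(T) = \frac{1}{1-\delta_1} \log\left( k \, W_d^{\delta_1} T_m^{\delta_2} \right) + \left( \frac{\mu - 1 + \delta_2}{1-\delta_1} \right) \log(W) - \frac{\mu}{1-\delta_1} \log(P) \tag{25}$$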



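It is observed that smaller $\delta_i$'s imply a slower scaling of $T$ with $W$.
Conversely, for any system operating at a particular $T$ and $W$, (25)
can be used to determine the effective value of $\mu$ as

$$\mu = \frac{(1-\delta_1) \log(T/c) + (1-\delta_2) \log(P)}{\log(W/P)} + (1-\delta_2) \tag{26}$$

where $c = \left( k \, T_m^{\delta_2} W_d^{\delta_1} \right)^{\frac{1}{1-\delta_1}}$.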

Note that $\mu \to \infty$ as $T \to \infty$ for sparse channels, which implies
that first- and second-order optimality can be achieved by simply
increasing $T$. This is due to the impact of sparsity in Doppler and in
direct contrast to the case of rich multipath, where the coherence
requirement is independent of signaling duration. We provide a numerical
illustration of the results by considering the low SNR asymptote of
the coherent capacity in (19). The coefficients of the first- and
second-order terms are $A_1 = \log_2(e)$ and $A_2 = -\log_2(e)$, respectively. In
Fig. 2 we plot the numerically estimated values $c_1$ and $c_2$ of $A_1$
and $A_2$, respectively, for the training-based communication scheme,
which are estimated using Monte-Carlo simulations. We observe that
for a large enough $T$ such that $\mu > 3$, the second-order constant
$c_2 \to A_2 = -\log_2(e)$. Also shown in the figure is the behavior of
the first-order constant, and it is seen that $c_1 \to A_1$ for a much smaller
value of $T$ since all we need is $\mu > 1$.
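The growth of the effective $\mu$ with signaling duration can be illustrated directly by inverting $N_c = k/\mathrm{SNR}^{\mu}$, which is algebraically equivalent to the expression for $\mu$ in terms of $(T, W, P)$. This is a sketch under the assumed sparse scaling model with unit proportionality constants; the function name `effective_mu`, the default $k = 1$ and the numerical values are hypothetical.

```python
import math


def effective_mu(T, W, P, T_m, W_d, delta1, delta2, k=1.0):
    """Effective mu such that N_c = k / SNR**mu, with SNR = P / W < 1."""
    if not W > P > 0.0:
        raise ValueError("requires 0 < P < W, i.e. SNR < 1")
    N_c = (T ** (1.0 - delta1) * W ** (1.0 - delta2)
           / (W_d ** delta1 * T_m ** delta2))
    return math.log(N_c / k) / math.log(W / P)


# Hypothetical operating point: W = 1 GHz, P/N_0 = 40 dB.
W, P, T_m, W_d = 1e9, 1e4, 1e-6, 100.0
for T in (1e-3, 1.0, 100.0):
    sparse = effective_mu(T, W, P, T_m, W_d, 0.5, 0.5)
    rich_doppler = effective_mu(T, W, P, T_m, W_d, 1.0, 0.5)
    # Sparse in Doppler: mu grows with T. Rich in Doppler: mu ignores T.
    print(f"T={T:g} s: mu sparse={sparse:.3f}, mu rich-in-Doppler={rich_doppler:.3f}")
```

With $\delta_1 < 1$ the effective $\mu$ increases without bound as $T$ grows, whereas with $\delta_1 = 1$ it is unaffected by the signaling duration.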

Contrary to the traditional emphasis on peaky signaling to improve 
the spectral efficiency of non-coherent communication, our results 
imply that delay-Doppler sparsity, along with a suitable choice of
$T$, $W$ and $P$ as in (25), is sufficient to achieve a desired level of
coherence with non-peaky signaling schemes. It is shown in [11] that
the Tcoh requirements for non-peaky signals under our framework 
are still better than those for peaky training-based communication 
schemes proposed in [2] based on a rich multipath assumption. 



[Figure 2: plot for W = 1 GHz, P/N_0 = 40 dB; horizontal axis: signaling duration T (secs.)]

Fig. 2. Convergence of the coefficients of the SNR and SNR^2 terms in
capacity as a function of T.



IV. Reliability of Sparse Multipath Channels

The reliability function of the channel is defined as [7]

$$E(R) = \limsup_{N \to \infty} \frac{-\log P_e(N, R)}{N}$$

where $P_e(N, R)$ is the average probability of error over an ensemble
of codes (random coding) in which each codeword spans the signal
space dimensions $N = TW$ and communication takes place at
transmission rate $R$. For any finite $N$, while the random coding
(iV, R) = 



Rrr 
(iV,-l)jf(l-(if)l-') 



R~o{l) 



0< R< Rc 



^ log ( 1 + t,r''"^'' ] p'R 0(1) j?..<i?<i?„ 



i? > i? ma a: 



1, 2JVc ; [2+(iV,-l)if-(l-{X-)l--)J 
(JVe-l){l+>7*iVcSNR) + (l->7*)]VcSNR 



(JVc-2)]VcSNR 



iVe-2)iVeSNR 
iVcSNR+iVc-1 



-(2 + fci) + ^(2 + fci)2+4(l + fci)(^-l) 



2(l + fci) 



fei = (iVc - 1)^-* (1 - (7^*)!^') 



(27) 

(28) 
(29) 
(30) 
(31) 

(32) 



exponent $E_r(N, R)$ provides a lower bound to $E(R)$, the
sphere-packing exponent $E_{sp}(N, R)$ is an upper bound to $E(R)$ [7]. We
recall the random coding upper bound on $P_e$ [7] given by

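$$P_e \le e^{-N \left[ E_r(N, R) \right]} \tag{33}$$

$$E_r(N, R) = \max_{0 \le \rho \le 1} \max_{Q} \left[ E_0(N, \rho, Q) - \rho R \right] \tag{34}$$

$$E_0(N, \rho, Q) = -\frac{1}{N} \log \left( E_H \left[ \int_y \left[ \int_x Q(x) \, p(y \mid x, H)^{\frac{1}{1+\rho}} \, dx \right]^{1+\rho} dy \right] \right) \tag{35}$$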



We compute the random coding error exponent in (34) for the training
and communication scheme described in Sec. II-C. The result is
summarized in the following theorem.

Theorem 2: The average probability of error for the training-based
communication scheme is upper-bounded by

$$P_e \le e^{-N \left[ E_r^{tr}(N, R) \right]}$$

where $E_r^{tr}(N, R)$ is given in (27). $R_{\max}$ in (29)
defines the maximum rate up to which we have a non-zero error
exponent (decaying $P_e$). The critical rate, $R_{cr}$ in (28), delineates the
regime of the optimal parameter $\rho^*$ that maximizes the exponent:
$\rho^* = 1$ for $0 \le R \le R_{cr}$, and $\rho^*$ is given in (32) for
$R_{cr} \le R \le R_{\max}$. The constant $\epsilon > 0$ is chosen very small
($\epsilon \to 0$) so that the $o(1)$ terms are negligible. See [13] for more details.
Note that the error exponent of the training and communication
scheme in (27) depends only on SNR and $N_c = k/\mathrm{SNR}^{\mu}$.

A. Discussion of Results on Reliability 

We investigate the behavior of the random coding exponent for
different values of $\mu$, as illustrated in Fig. 3 for the given channel
parameter set. It is observed that for any transmission rate $R$, there
exists an optimum value of $\mu = \mu_{opt}(R)$ for which the error exponent
in (27) is maximum. For any $N$, we formally define

$$\mu_{opt}(N, R) = \arg\max_{\mu} \left[ E_r(R, N, \mu) \right] \tag{36}$$

where we have written $E_r(R, N, \mu) = E_r^{tr}(N, R)$ in (27) explicitly
as a function of $\mu$ to emphasize its dependence. As we traverse
from $R = 0$ to $R = C_{coh}$ (as in (18)), the optimal operating point at
each rate is dictated by the value of $\mu_{opt}$ in (36) and can be achieved
by choosing $T$, $W$ and $P$ as in (23). Furthermore, the optimizing $\mu_{opt}$
increases monotonically as we consider larger transmission rates. In
fact, using the results on capacity from Theorem 1, it follows that
with $\mu = 1$, we only obtain first-order optimality and therefore the



error exponent in Fig. 3 is non-zero only for a fraction of the coherent
capacity, $C_{coh}$. On the other hand, at $R = C_{coh}$, we would require
$\mu_{opt} > 3$ (second-order optimal) in order to achieve a positive error
exponent.

In Fig. 4 we plot the error exponent in (27) as a function of the
parameter $\mu$ for two different transmission rates. For each scenario,
we observe that the error exponent is concave as a function of
$\mu$ and is maximized at $\mu = \mu_{opt}$. Also illustrated in the figure
is the error exponent with perfect CSI at the receiver, which is an
upper bound to $E_r^{tr}(N, R)$ in (27) and decreases monotonically with
$\mu$. These plots reveal a fundamental learnability versus diversity
tradeoff in sparse channels. For any rate $R$, when $\mu < \mu_{opt}(R)$ (too
little coherence), the system is in the learnability-limited regime and
the error exponent of the training-based communication scheme is
smaller due to poor channel estimation performance. On the other
hand, when $\mu > \mu_{opt}(R)$ (too much coherence), we are in the
diversity-limited regime and the hit taken by the error exponent here
is due to the inherent reduction in the degrees of freedom (DoF)
(or delay-Doppler diversity, $D$). The best exponent is obtained at
$\mu = \mu_{opt}(R)$, which demarcates the two regimes and describes the
optimal tradeoff.



[Figure 3: plot for P/N_0 = 40 dB, $\delta_1 = \delta_2 = 0.5$]

Fig. 3. Random coding error exponent versus rate for a sparse channel and
for the training-based communication scheme. Different curves correspond to
different values of $\mu$ in the key relationship $N_c = k/\mathrm{SNR}^{\mu}$.
Fig. 4. Illustration of the learnability versus diversity tradeoff for sparse
channels. The value of $\mu$ at which the maximum is attained in each case defines
$\mu_{opt}(R)$ at the corresponding transmission rate $R$.